07. More Black-Box Optimization
All of the algorithms that you’ve learned about in this lesson can be classified as black-box optimization techniques.
Black-box refers to the fact that in order to find the value of \theta that maximizes the function J = J(\theta), we need only be able to estimate the value of J at any potential value of \theta.
That is, neither hill climbing nor steepest-ascent hill climbing knows that we're solving a reinforcement learning problem, and neither cares that the function we're trying to maximize corresponds to the expected return.
These algorithms only know that for each value of \theta, there's a corresponding number. We know that this number corresponds to the return obtained by using the policy corresponding to \theta to collect an episode, but the algorithms are not aware of this. To the algorithms, the way we evaluate \theta is considered a black box, and they don't worry about the details. The algorithms only care about finding the value of \theta that will maximize the number that comes out of the black box.
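To make the black-box idea concrete, here is a minimal sketch of simple hill climbing in Python. The objective passed in is a toy quadratic rather than an expected return; the point is that `hill_climb` only ever calls `evaluate(theta)` and never looks inside it. The function and parameter names here are illustrative, not from the lesson's code.

```python
import numpy as np

def hill_climb(evaluate, dim, n_iters=500, noise_scale=0.1, seed=0):
    """Simple hill climbing on a black-box objective.

    `evaluate` maps a parameter vector theta to a single number;
    the algorithm never needs to know what that number represents.
    """
    rng = np.random.default_rng(seed)
    best_theta = rng.standard_normal(dim)
    best_score = evaluate(best_theta)
    for _ in range(n_iters):
        # Perturb the current best candidate with Gaussian noise.
        candidate = best_theta + noise_scale * rng.standard_normal(dim)
        score = evaluate(candidate)
        # Keep the candidate only if the black box reports a higher value.
        if score > best_score:
            best_theta, best_score = candidate, score
    return best_theta, best_score

# Toy black box: a quadratic with its maximum (value 0) at `target`.
# In the RL setting, evaluate() would instead run an episode with the
# policy parameterized by theta and return the episode's return.
target = np.array([1.0, -2.0, 0.5])
theta, score = hill_climb(lambda th: -np.sum((th - target) ** 2), dim=3)
```

Swapping in a different `evaluate` function, say one that runs an episode in a Gym environment, changes nothing about the optimizer itself; that is what makes it black-box.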
In the video below, you'll learn about a couple more black-box optimization techniques, including the cross-entropy method and evolution strategies.
## Video